Using Learned Policies in Heuristic-Search Planning

Authors

  • Sung Wook Yoon
  • Alan Fern
  • Robert Givan
Abstract

Many current state-of-the-art planners rely on forward heuristic search. The success of such search typically depends on heuristic distance-to-the-goal estimates derived from the plangraph. Such estimates are effective in guiding search for many domains, but there remain many other domains where current heuristics are inadequate to guide forward search effectively. In some of these domains, it is possible to learn reactive policies from example plans that solve many problems. However, due to the inductive nature of these learning techniques, the policies are often faulty, and fail to achieve high success rates. In this work, we consider how to effectively integrate imperfect learned policies with imperfect heuristics in order to improve over each alone. We propose a simple approach that uses the policy to augment the states expanded during each search step. In particular, during each search node expansion, we add not only its neighbors, but all the nodes along the trajectory followed by the policy from the node until some horizon. Empirical results show that our proposed approach benefits both of the leveraged automated techniques, learning and heuristic search, outperforming the state-of-the-art in most benchmark planning domains.
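The expansion step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes greedy best-first search as the underlying forward search, and the names `policy_augmented_search`, `successors`, `heuristic`, `policy`, and `horizon` are hypothetical. For brevity the policy is assumed to return the next state directly (or `None` when it has no suggestion), rather than an action.

```python
import heapq
from itertools import count


def policy_augmented_search(start, is_goal, successors, heuristic, policy,
                            horizon, max_expansions=10_000):
    """Greedy best-first search with policy-trajectory augmentation.

    Assumed (hypothetical) interfaces:
      successors(state) -> iterable of neighbor states
      heuristic(state)  -> estimated distance to the goal
      policy(state)     -> next state chosen by the learned policy, or None
    Returns a list of states from start to a goal, or None on failure.
    """
    tie = count()  # tie-breaker so the heap never compares states directly
    frontier = [(heuristic(start), next(tie), start)]
    parent = {start: None}  # doubles as the set of generated states
    expansions = 0

    def add(state, prev):
        # Queue a state only the first time it is generated.
        if state not in parent:
            parent[state] = prev
            heapq.heappush(frontier, (heuristic(state), next(tie), state))

    while frontier and expansions < max_expansions:
        _, _, state = heapq.heappop(frontier)
        if is_goal(state):
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        expansions += 1

        # Ordinary expansion: add the immediate neighbors.
        for nxt in successors(state):
            add(nxt, state)

        # Augmentation: also add every state on the policy's trajectory
        # from this node, up to the given horizon.
        prev = state
        for _ in range(horizon):
            nxt = policy(prev)
            if nxt is None:
                break
            add(nxt, prev)
            prev = nxt

    return None
```

With a helpful policy, the trajectory states let the search jump ahead by up to `horizon` steps per expansion; with a faulty policy, the extra states simply sit in the queue with poor heuristic values and the search falls back on the heuristic alone.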

Similar resources

Discrepancy Search with Reactive Policies for Planning

We consider a novel use of mostly-correct reactive policies. In classical planning, reactive policy learning approaches could find good policies from solved trajectories of small problems and such policies have been successfully applied to larger problems of the target domains. Often, due to the inductive nature, the learned reactive policies are mostly correct but commit errors on some portion...

Learning Weighted Rule Sets for Forward Search Planning

In many planning domains, it is possible to define and learn good rules for reactively selecting actions. This has led to work on learning rule-based policies as a form of planning control knowledge. However, it is often the case that such learned policies are imperfect, leading to planning failure when they are used for greedy action selection. In this work, we seek to develop a more robust f...

Learning Generalized Reactive Policies using Deep Neural Networks

We consider the problem of learning for planning, where knowledge acquired while planning is reused to plan faster in new problem instances. For robotic tasks, among others, plan execution can be captured as a sequence of visual images. For such domains, we propose to use deep neural networks in learning for planning, based on learning a reactive policy that imitates execution traces produced b...

Kernel Regression for Planning Heuristics

Modern automated planning revolves around forward state-space heuristic search. These planners guide their search with powerful domain-independent heuristics that estimate the distance to a goal state by efficiently solving related, but simplified, planning problems. This approximation trades off efficiently with approximation quality. In the past 15 years, the Planning and Learning community h...

Scaling up Heuristic Planning with Relational Decision Trees

Current evaluation functions for heuristic planning are expensive to compute. In numerous planning problems these functions provide good guidance to the solution, so they are worth the expense. However, when evaluation functions are misguiding or when planning problems are large enough, lots of node evaluations must be computed, which severely limits the scalability of heuristic planners. In th...

Publication date: 2007